Facing The Machine Translation Babel in CLIR – Can MT Metrics Help in Choosing CLIR Resources?

Author

  • Kimmo Kettunen
Abstract

This paper describes the use of MT metrics for choosing the best candidates for MT-based query translation resources in Cross-Language Information Retrieval (CLIR). Our metric is METEOR. The language pair of our evaluation is English → German, because METEOR does not offer many language pairs for comparison, and because many MT programs are available for English → German. We evaluated translations of CLEF 2003 topics produced by twelve different MT programs with the MT metric and compared the metric scores to the mean average precision (MAP) results of the corresponding CLIR runs. Our results show that for long topics the correlation between the achieved MAPs and the MT metric is high (0.88), and for short topics it is lower but still clear (0.59). Overall, METEOR seems to distinguish the worst MT programs from the best ones easily, but smaller differences are not seen as clearly. Some intrinsic properties of METEOR are also ill-suited to CLIR resource evaluation, because some of what the metric measures, especially word order, is not significant for CLIR.
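The comparison described in the abstract, one metric score and one MAP value per MT program, correlated across programs, can be sketched as follows. This is a minimal illustration only: the abstract does not say which correlation coefficient was used, so Pearson's r is an assumption here, and the function name `pearson` and all score values are hypothetical placeholders, not figures from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: the abstract does not name the coefficient it reports;
    Pearson's r is used here purely for illustration.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values: one entry per MT program.
# These are NOT results from the paper.
meteor_scores = [0.20, 0.35, 0.50, 0.65]
map_scores = [0.10, 0.22, 0.27, 0.40]

r = pearson(meteor_scores, map_scores)
print(f"correlation between metric scores and MAP: {r:.2f}")
```

With real data, each list would hold twelve values, one per MT program, computed separately for the long-topic and short-topic runs.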


Similar resources

Choosing the Best MT Programs for CLIR Purposes - Can MT Metrics Be Helpful?

This paper describes usage of MT metrics in choosing the best candidates for MT-based query translation resources. Our main metrics is METEOR, but we also use NIST and BLEU. Language pair of our evaluation is English German, because MT metrics still do not offer very many language pairs for comparison. We evaluated translations of CLEF 2003 topics of four different MT programs with MT metrics a...


On Bidirectional English-Arabic Search

In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRD) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRD to Arabic-English and English-Arabic CLIR. The translation ambiguity associated with these resources is ...


Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew's Collection

For the CLEF 2004 ImageCLEF St Andrew’s Collection task the Dublin City University group carried out three sets of experiments. We carried out standard cross-language information retrieval (CLIR) runs using topic translation using machine translation (MT), combination of this run with image matching results from the VIPER system, and a novel document rescoring approach based on automatic MT eva...


Should MT Systems Be Used as Black Boxes in CLIR?

The translation stage in cross language information retrieval (CLIR) acts as the main enabling stage to cross the language barrier between documents and queries. In recent years machine translation (MT) systems have become the dominant approach to translation in CLIR. However, unlike information retrieval (IR), MT focuses on the morphological and syntactical quality of the sentence. This requir...


Indonesian-Japanese CLIR Using Only Limited Resource

Our research aim here is to build a CLIR system that works for a language pair with poor resources where the source language (e.g. Indonesian) has limited language resources. Our Indonesian-Japanese CLIR system employs the existing Japanese IR system, and we focus our research on the Indonesian-Japanese query translation. There are two problems in our limited resource query translation: the OOV p...




Publication date: 2009